Cooperative federation of digital devices via proxemics and device micro-mobility

ABSTRACT

The subject disclosure is directed towards co-located collaboration/data sharing that is based upon detecting the proxemics of people and/or the proxemics of devices. A federation of devices is established based upon proxemics, such as when the users have entered into a formation based upon distance between them and orientation. User devices may share content with other devices in the federation based upon micro-mobility actions performed on the devices, e.g., tilting and/or otherwise interacting with a sending device.

BACKGROUND

Despite the ongoing proliferation of useful digital devices having various form-factors such as slates and electronic whiteboards, such technology often hinders (rather than helps) interactions between small groups of people. Whereas natural human conversation is fluid and dynamic, discussions that rely on digital content, such as slides, documents, and clippings, often remain stilted and unnatural due to the awkwardness of manipulating, sharing, and displaying information on and across multiple devices.

SUMMARY

This Summary is provided to introduce a selection of representative concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used in any way that would limit the scope of the claimed subject matter.

Briefly, various aspects of the subject matter described herein are directed towards a technology in which a server is configured to process image data to determine a formation based upon proxemics of users having digital devices. The server further maintains federation data corresponding to which devices are in the formation and communicates the federation data to the devices to establish a federation between the devices. This server may reside on one or more of the federated devices, on a host in the local environment, or be distributed in “the cloud” (e.g., among various networked servers hosting data and services).

In one aspect, upon establishing a federation among devices, including devices associated with one or more users (and/or one or more devices associated with no users), the federation data indicative of the federation is maintained and used to determine a set of federated devices that are able to communicate data between one another. The communication is based upon the federation and one or more micro-mobility actions.

In one aspect, content is received at a recipient device from a sending device, in which the sending device and the recipient device are participants in a federation established via proxemics between the devices. The content is rendered on the recipient device, including rendering the content in partial, increasing amounts over time until fully rendered. The rendering in partial, increasing amounts occurs from one direction based upon the relative locations of the devices to have the content appear to have come from a direction of the sending device relative to the recipient device.

Other advantages may become apparent from the following detailed description when taken in conjunction with the drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention is illustrated by way of example and not limited in the accompanying figures in which like reference numerals indicate similar elements and in which:

FIG. 1 is a block diagram showing an example architecture that allows content sharing/collaboration among federated devices based upon proxemics and micro-mobility actions detected at devices, according to one or more example embodiments.

FIG. 2 is a representation of how users and devices may be sensed to establish and maintain a federation, according to one or more example embodiments.

FIG. 3 is a representation of how a formation of users may be sensed and used to establish a federation, according to one or more example embodiments.

FIGS. 4A-4C are representations of different formations that may be sensed and used to establish and maintain a federation, according to one or more example embodiments.

FIG. 5 is a representation of a pipeline that may be used to process captured image data into a formation for establishing a federation, according to one or more example embodiments.

FIGS. 6A and 6B are representations of how a user and the user's orientation may be detected among depth data, according to one or more example embodiments.

FIGS. 7A and 7B are representations of how micro-mobility actions (e.g., different device tilting) may be used to share data, according to one or more example embodiments.

FIG. 8 is a flow diagram showing how a federation may be established and dynamically maintained, according to one or more example embodiments.

FIG. 9 is a block diagram representing an example environment into which aspects of the subject matter described herein may be incorporated.

DETAILED DESCRIPTION

Various aspects of the technology described herein are generally directed towards having mobile devices such as slates and tablets, as well as situated devices such as desktops, tabletops, and electronic whiteboards, operate in a manner that is based upon the physical co-presence (“proxemics”) of nearby persons and devices. To this end, the distance and relative body orientation amongst small groups of co-located users is sensed, and combined with motion and orientation sensor data (e.g., via a three-axis accelerometer/gyroscope sensor) to allow nuanced and subtle device motions to initiate sharing of content, or overtures to share content, amongst users. The system also may determine device identity and coarse-grained proximity information amongst the devices themselves. Further, multi-touch and pen gestures input to the devices, or motion gestures articulated using the devices themselves (shaking, tilting, twisting, bumping, etc.), can be used to explicitly initiate exchanges, and/or parameterize what content (if any) is being offered to other users.

As will be understood, the resulting multi-modal fusion of sensors and inputs allows techniques for sharing amongst devices that are automatically federated based upon the physical person-to-person distance, as well as the physical relative body orientation, amongst groups of co-located users. Examples include passing back and forth of pieces of content amongst a small circle of users standing together, while excluding other persons who may happen to be nearby, but are facing away or otherwise excluded (e.g., outside a formation “circle” but facing it) from the small-group formation.

It should be understood that any of the examples herein are non-limiting. As such, the present invention is not limited to any particular embodiments, aspects, concepts, structures, functionalities or examples described herein. Rather, any of the embodiments, aspects, concepts, structures, functionalities or examples described herein are non-limiting, and the present invention may be used in various ways that provide benefits and advantages in computing in general.

As generally represented in FIG. 1, a server 102 receives images from depth sensors or the like (e.g., cameras and other sensors, referred to herein as cameras for simplicity) 104 and 106, and recognizes devices 108-110, such as by small radios coupled to each of the devices. Although two cameras 104 and 106 are depicted, it is feasible to have any practical number of cameras, as well as one or more pan/tilt or moving cameras, to provide a desired coverage.

Similarly, although three devices 108-110 are depicted, it is feasible to have from one to any number of devices, up to a practical limit for reasonably-sized “small” groups. Each of the devices runs a software program (e.g., an application) that coordinates with the server 102, receives input related to communicating information with other user devices based upon proxemics, and sends data to other user devices based upon proxemics and the input or receives data from other user devices based upon proxemics and input sensed at those devices.

The server 102 processes the camera images to determine whether one or more people are in the cameras' field of view, as represented by the person, orientation and formation detection pipeline 112 in FIG. 1 (examples of such stages/algorithms of the pipeline 112 are described below with reference to FIG. 5). If so, the server 102 determines the distance between the people and the orientations and the directions of the people to one another (a singleton person is also detected). Fixed or semi-fixed features of the environment, such as wall computers, electronic whiteboards, digital tabletops, or even non-interactive furniture, walls, and dividers, may also be sensed and/or pre-registered in a database of features in the local context. When a federation 114 is detected corresponding to appropriate formations, the server 102 maintains a list or the like of one or more sets of federated devices, and updates the devices of a federation upon a state change. It is also possible for multiple sets of federated devices to overlap one another, e.g., a device may simultaneously be federated with two different groups, possibly with different permissions/relationships.

It is also feasible to have a federation with zero or more users. For example, devices may be arranged previously by users and maintain some relationship, even when no users are currently proximal. In this instance, the server does not detect people, or at least does not require people to be present to maintain the federation, but rather uses the devices. With one user, a federation may change based upon proxemics, e.g., the facing or not facing of a single user to a wall display may be used to federate that user's device and the wall display (or not).

In one implementation, the various system components are connected via TCP over wireless Ethernet, with a message protocol based upon Windows® Communication Framework (WCF) to transmit application and sensor state. The server 102 maintains a global model of the spatial relationships, and notifies clients of state changes.

As can be readily appreciated, the server may be a central server, or another model such as a peer model may be used. For example, the devices may include server code so that a central server is not needed, e.g., with one device acting as the server, possibly with assistance from another device or devices in the group (e.g., like a distributed server concept). Thus, a device may be a client of its own server.

In one implementation generally represented in FIG. 2, the cameras 104 and 106 comprise overhead depth (e.g., Kinect™) cameras that allow tracking persons (e.g., 220 and 221), generally in the form of moving blobs, that the server recognizes as people. Because the devices are not recognized by the cameras, radios are used to associate devices to sensed persons. However, devices with visible tags sensed by the camera, or RF (radio frequency) or other tags sensed by other means, are also feasible.

More particularly, as generally represented in FIG. 2, the overhead cameras allow the tracking of moving blobs that are recognized as people, while radios, along with wireless radio signal trilateration, associate devices carried by the sensed persons to the individual persons. In one implementation, the radios need line of sight for communication, which is beneficial in that the signals tend to stay within the social boundaries of meeting spaces delineated by walls, partitions, furniture and the like. Thus, QSRCT radios may be used, which have a maximum range of approximately 15 meters and also sense approximate distance between radio modules (at 10 cm accuracy at ninety percent confidence). Three point location (trilateration) is employed by putting three radio nodes (three such nodes 224, 225 and 226 are shown in FIG. 2) at fixed locations in the space around the edges of the area tracked by the depth cameras. A mobile radio on each device (not shown) sends range-finding requests to each fixed node and the resulting measurements are Kalman-filtered to reduce noise. The device location is then the intersection of the three circles.
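By way of a non-limiting illustration, the three-circle intersection described above may be sketched as follows. The function name `trilaterate`, the coordinate frame, and the linearized solve are assumptions made for this example rather than details of any described implementation; with noisy ranges the circles rarely meet at a single point, and the linear solve then yields only an approximate location.

```python
def trilaterate(anchors, ranges):
    """Estimate a 2D position from three fixed nodes and measured ranges.

    Subtracting the first circle equation from the other two removes the
    quadratic terms and leaves a 2x2 linear system, solved here with
    Cramer's rule. (Hypothetical helper, for illustration only.)
    """
    (x1, y1), (x2, y2), (x3, y3) = anchors
    r1, r2, r3 = ranges
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = r1**2 - r2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = r1**2 - r3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-9:
        raise ValueError("fixed nodes are collinear; position is ambiguous")
    x = (b1 * a22 - b2 * a12) / det
    y = (a11 * b2 - a21 * b1) / det
    return x, y
```

The guard on the determinant reflects why the fixed nodes would be placed around the edges of the tracked area rather than along a single line.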

The server matches the device to a person by assigning the device to the person who is standing closest (as sensed by the cameras 104 and 106) to the trilaterated location. If the distance between the device and all currently tracked people is above a tolerance threshold (currently 1 m), the device remains unassigned. To assign other devices, the process is repeated with the remaining radios attached to the tablets. Note that an electronic whiteboard 228 is shown in FIG. 2. Such a device can be considered a participant, like a person, although radios need not be used with fixed devices because the location, type and “identity” of the fixed device do not change, at least not often. Further note that other types of devices may be participants. Note that a portable device such as a projector may have a radio, and thus is even more like a person.
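The nearest-person matching with a tolerance threshold may be sketched as below. This is a minimal example under stated assumptions: the name `assign_device`, the dictionary of tracked people, and positions expressed in meters are hypothetical; only the 1 m tolerance comes from the description above.

```python
import math

def assign_device(device_xy, people, tolerance_m=1.0):
    """Return the id of the tracked person closest to the device location,
    or None when every tracked person is farther than tolerance_m."""
    best_id, best_dist = None, float("inf")
    for person_id, (px, py) in people.items():
        dist = math.hypot(px - device_xy[0], py - device_xy[1])
        if dist < best_dist:
            best_id, best_dist = person_id, dist
    # A device beyond the tolerance from everyone remains unassigned.
    return best_id if best_dist <= tolerance_m else None
```

Calling the function once per mobile radio corresponds to repeating the process for the remaining devices.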

Trilateration with the radios and matching of the radio IDs to the devices held by people tracked via the cameras allows associating each device to a particular person. This also facilitates identifying a person based on the device he or she carries (under the assumption that each person carries their own device). As people come and go from the field-of-view of the cameras, the radios enable tracking who is entering or leaving, as well as maintaining some awareness of whether the person has walked away, or still remains nearby but out of depth-camera view.

As can be readily appreciated, other techniques to associate people and devices may be used. For example, if the participants of a group are recognized in some way, such as by having to look up to the camera once to participate or speak their names (to facilitate facial recognition or voice recognition, respectively), or by sensing identification such as badges or those with radios/RFID tags or the like, then the system may access a list of device identifier(s) associated with each user. The system may then allow all devices associated with that user or that are registered by the user with the system for sharing (such as a tablet, cell phone, laptop and any other portable devices). Note that it is possible that the federated/not federated state of one user's multiple devices may not all be the same, e.g., a user may hold up his tablet while his phone is still in his pocket, such that the tablet is fully federated but the phone is not.

Turning to behavioral sensing, namely the tracking of F-formations, as is known F-formations deal with the physical arrangements that people adopt when they engage in relatively close, focused conversational encounters. In general, F-formations consider the spatial relationships that occur as people arrange themselves during personal interaction for task-dependent communication and clarity of perception.

As represented in FIG. 3, a typical F-formation arrangement is a roughly circular cluster that contains two to five persons who are actively part of a group. The inner space of that circle (referred to as o-space, labeled “o” in FIG. 3) is reserved for the main activity of the group. The ring of space occupied by the people (p-space, labeled “p” in FIG. 3) determines group membership. The surrounding region (r-space, labeled “r” in FIG. 3) buffers the group from the outside world. Thus, persons who are nearby but not in p-space are excluded from the fine-grained social circle that defines the F-formation. At the same time, the r-space is monitored to see if any others are trying to join. For example, an approaching person in r-space may be greeted by one or more members of the group via eye contact, while a person who is facing away, even if close to the group, is ordinarily not treated as a potential member.

F-formations are often nuanced. For example, F-formations need not be circular. Instead, different relative body orientations are typical. Such orientations include face-to-face (FIG. 4A), which typically corresponds to competitive tasks. Other common orientations include side-by-side (FIG. 4B), typically corresponding to collaborative tasks, and corner-to-corner (L-shaped, FIG. 4C), typically corresponding to communicative tasks.

In one implementation, the overhead cameras 104 and 106 (e.g., mounted in the ceiling and looking downwards) capture the spatial relationships between people, including the distance, viewing direction, and relative body orientation between persons. A mechanism (e.g., an algorithm) as described herein aggregates these sensed persons into F-formations and classifies the type of assemblage as face-to-face, side-by-side, or corner-to-corner (or as a singleton that is not a member of any formation).

Note that with respect to “distance,” distance may be a generalized notion of a “social distance” and not necessarily a strictly Euclidean linear distance. The concept of “distance” may take into account relative body orientation, as well as boundaries of social spaces (e.g., office walls, partitions between cubicles, etc., if these are known); low-power radio technology may respect most such boundaries due to the frequency band employed, but also may go through some materials.

In one implementation, a Kinect™ field-of-view of 58.5 degrees, with the cameras mounted 200 cm above the floor, produces an approximate 220×94 cm tracking area per camera. The cameras may be arranged in an L-shaped sensing region (FIG. 2) and, with a one-time calibration step, compose the data into a combined depth image. For each depth image the orthographic projection of the camera view is calculated, with filters and linear interpolation used to remove noise resulting from the partial overlap of the cameras' structured light patterns.

One implementation generally represented in FIG. 5 includes the processing pipeline 112 for F-formation detection, in which the combined depth image 552 may be processed in multiple stages (or steps) to identify the location, orientation, and formations of people. In the pipeline, a filtering stage 554 filters out connected components that are too small or too large to be an actual person, leaving just those components most likely to represent a person; these are considered people thereafter in the example pipeline.
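A size-based filter over connected components, in the spirit of the filtering stage, may be sketched as follows. The binary foreground mask, the 4-connectivity, and the pixel-count bounds `min_pixels` and `max_pixels` are assumptions chosen for illustration; the description does not specify how component size is measured.

```python
from collections import deque

def connected_components(mask):
    """Collect 4-connected components of truthy cells in a 2D grid."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            component, queue = [], deque([(r, c)])
            seen[r][c] = True
            while queue:  # breadth-first flood fill
                y, x = queue.popleft()
                component.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(component)
    return components

def filter_person_candidates(mask, min_pixels, max_pixels):
    """Keep only components whose pixel count is plausible for a person."""
    return [comp for comp in connected_components(mask)
            if min_pixels <= len(comp) <= max_pixels]
```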

Height normalization 556 normalizes the heights of persons to match. One algorithm finds the highest point of the detected candidate people identified in the previous stage, and shifts the depth values of the remaining outlines of people by the difference in height.
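One possible reading of this normalization is sketched below. It assumes, for simplicity, that each candidate's depth samples have already been converted to heights above the floor, so that the "highest point" is the maximum value; the function name and the dictionary layout are hypothetical.

```python
def normalize_heights(people_heights):
    """Shift every person's samples so that all heads reach the same height.

    people_heights maps a person id to a list of height samples (e.g., cm
    above the floor); each person is raised by the gap between the tallest
    person's highest point and that person's own highest point.
    """
    tallest = max(max(samples) for samples in people_heights.values())
    return {
        person_id: [value + (tallest - max(samples)) for value in samples]
        for person_id, samples in people_heights.items()
    }
```

After this shift, fixed depth bands (head, shoulders, torso) can be applied uniformly to people of different heights.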

A head detection stage 558 detects heads by assuming the topmost depth band (in a separated two-dimensional (2D) image) represents people's heads. One algorithm identifies the connected components in this separated 2D image. The results are ellipsoidal outlines of people's heads. The major axis of the ellipse is used as the orientation of the head. This is generally shown in FIGS. 6A and 6B, with the top-down image “blob” (FIG. 6B) showing the head as an ellipse shaded with vertical stripes.

Another stage 560 of the pipeline calculates body orientation. A second depth band (FIG. 6A) includes all regions belonging to people's shoulders. Each detected shoulder region is assigned to the person to which it is closest. The convex hull is used (because the shoulder is not necessarily a single connected component) to get an ellipsoidal outline of a person's shoulders. The major axis of that ellipse gives the orientation of the person's body, shown in FIG. 6B as the ellipse shaded with horizontal lines.
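The major-axis computation may be illustrated with second-order central moments, a standard way of obtaining the principal axis of a point set. Note that this sketch substitutes a moment-based fit over the raw points for the convex-hull ellipse described above, and that the function name is hypothetical.

```python
import math

def major_axis_angle(points):
    """Return the major-axis angle (radians) of a 2D point set.

    Uses theta = 0.5 * atan2(2*mu11, mu20 - mu02), where mu20, mu02 and
    mu11 are central moments. The result is only defined up to 180 degrees,
    which is the same ambiguity a fitted ellipse has.
    """
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    mu20 = sum((x - cx) ** 2 for x, _ in points) / n
    mu02 = sum((y - cy) ** 2 for _, y in points) / n
    mu11 = sum((x - cx) * (y - cy) for x, y in points) / n
    return 0.5 * math.atan2(2 * mu11, mu20 - mu02)
```

The same function can serve for both the head band and the shoulder band, since each stage reduces to finding the long axis of an elongated region.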

The orientations calculated in the head detection and body orientation stages still have a 180-degree ambiguity. To determine which way the user is facing, a direction stage 562 takes a third depth band that corresponds to the user's torso, including their arms and hands (and any devices), as generally represented in FIG. 6A. The side of the major body axis that a person's arms and hands appear on, as well as which side the head is shifted towards, is taken as the front (shown as the unshaded portion of the blob in FIG. 6B). Hysteresis may be used to prevent momentary flips in the facing vector due to noise.
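Resolving the 180-degree ambiguity and damping momentary flips may be sketched as follows. The use of an arms-and-hands centroid, the frame-count form of hysteresis, and all names and default values are assumptions for this example; the description does not state how the hysteresis is realized.

```python
import math

def facing_vector(body_center, axis_angle, arms_centroid):
    """Pick the normal of the shoulder axis that points toward the arms."""
    nx, ny = -math.sin(axis_angle), math.cos(axis_angle)
    dx = arms_centroid[0] - body_center[0]
    dy = arms_centroid[1] - body_center[1]
    if nx * dx + ny * dy < 0:  # arms lie on the other side of the axis
        nx, ny = -nx, -ny
    return nx, ny

class FacingHysteresis:
    """Accept a reversed facing vector only after it persists for
    frames_required consecutive updates (hypothetical default of 5)."""

    def __init__(self, frames_required=5):
        self.frames_required = frames_required
        self.current = None
        self._flip_count = 0

    def update(self, candidate):
        if self.current is None:
            self.current = candidate
            return self.current
        dot = candidate[0] * self.current[0] + candidate[1] * self.current[1]
        if dot < 0:  # candidate points the opposite way
            self._flip_count += 1
            if self._flip_count >= self.frames_required:
                self.current, self._flip_count = candidate, 0
        else:
            self.current, self._flip_count = candidate, 0
        return self.current
```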

With respect to detecting F-formations at a formation stage 564, two people can be in an F-formation if: (a) they are not standing behind each other, (b) the angle between their orientation vectors is smaller than 180 degrees (otherwise they would face away from each other), and (c) the distance between individuals is small enough so they can comfortably communicate and their o-spaces (FIG. 3) overlap. An algorithm iterates over all pairs of people, calculates the distance and angle between them, and assigns an F-formation type (e.g., side-by-side, L-shaped, face-to-face, or none) based on tolerance thresholds. Hysteresis prevents frequent switching of detected formations if measurements lie close to a threshold. Singletons and persons in r-space (outside a formation) are also detected and tagged.
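A pairwise classification of this kind may be sketched as below. Every threshold here (`max_dist`, `behind_deg`, and the 45/135-degree split between formation types) is a hypothetical value chosen for illustration, as the description does not give the actual tolerances, and hysteresis is omitted for brevity.

```python
import math

def angle_between(u, v):
    """Unsigned angle in degrees between two non-zero 2D vectors."""
    dot = u[0] * v[0] + u[1] * v[1]
    norm = math.hypot(*u) * math.hypot(*v)
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))

def classify_pair(pos_a, face_a, pos_b, face_b, max_dist=1.5, behind_deg=135.0):
    """Return 'face-to-face', 'L-shaped', 'side-by-side' or 'none'."""
    a_to_b = (pos_b[0] - pos_a[0], pos_b[1] - pos_a[1])
    b_to_a = (-a_to_b[0], -a_to_b[1])
    dist = math.hypot(*a_to_b)
    if dist == 0 or dist > max_dist:
        return "none"  # too far apart to converse comfortably
    # Reject pairs where either person has the other behind them.
    if (angle_between(face_a, a_to_b) > behind_deg
            or angle_between(face_b, b_to_a) > behind_deg):
        return "none"
    theta = angle_between(face_a, face_b)
    if theta >= 135.0:
        return "face-to-face"
    if theta >= 45.0:
        return "L-shaped"
    return "side-by-side"
```

Iterating this function over all pairs of tracked people, and grouping the pairs that return a type other than 'none', gives one way of forming the candidate federations.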

Note that in one system, sensed formations lead to federated groups in which the barriers for sharing digital content are lowered. However, one system does not necessarily assume that people in a group want to share. Thus, a hybrid approach of sensed F-formations plus device micro-mobility may be used to initiate the actual cross-device sharing of content. This approach is directed to contexts such as small-group meeting spaces, where people have freedom of movement, but it may not be suitable for crowded environments such as a conference hall or subway station. In some embodiments, the system may disable, limit, or otherwise modify the number and scope of operations available to F-formations if a crowded environment is sensed. This may include deferring shared content to an asynchronous sharing mechanism which needs subsequent sign-in (password) or other validation to actually access the full content.

Note that asynchronous sharing may be used for other purposes. For example, users who have no devices associated with them may still participate in a group (e.g., shared content may be retrievable by them in the cloud, later, in an asynchronous manner).

Described herein are various interaction techniques that facilitate the sharing of digital information between the devices of people standing in an F-formation. By way of example, some of the techniques described herein are based upon detected micro-mobility gestures; note however that in some embodiments these gestures are only active when the user is currently sensed as standing in an F-formation, or even a specific type of F-formation.

In general, the exemplified interaction techniques facilitate transient sharing, copying, transfer, and reference to digital information across federated devices. More particularly, the exemplified system offers multiple ways to support co-located collaborative activity, with various and nuanced semantics of what it means to “share” content. For purposes of explanation, a two-user F-formation involving handheld devices is used in some of the examples, in which the user initiating the interaction is the sender, and the other person is the recipient. As will be understood, the techniques work with more than two people, and also with a large display (or other fixed or semi-fixed features in the local environment).

A first technique is referred to as Tilt-to-Preview, with respect to previewing selected content. The Tilt-to-Preview technique provides a lightweight way to transiently share selected digital content across devices. The receiving user can elect to keep a copy of the transiently shared information.

While tilting within the o-space, the sender shares a selected piece of content by holding his or her finger on the piece of content while simultaneously tipping the tablet device 772 slightly, e.g., by a minimum of ten degrees (FIG. 7A). Tipping a slate beyond this threshold serves both as a gesture to trigger the behavior as well as a social cue observable to the recipient (and any other people nearby) that the sender wishes to share something. In two-person F-formations, the sender tips the slate towards the other user. Alternatively, or if there are more than two people in the F-formation, the user can also tip the slate towards the o-space (i.e., the center of the formation).

Note that the above gesture has the user touching the content while tilting. Other gestures are feasible, and indeed, a user may configure the application to relate other gestures to actions. For example, consider that a user already has a piece of content selected (in any suitable way) for sharing. Instead of or in addition to tilting, the user may shake the device a certain way, or move it back and forth towards the recipient or o-space. This allows a user to maintain both hands on the device. Thus, any of the gestures and movements (e.g., tilting) described herein are only non-limiting examples.

This gesture is only active when the tablet is held up within o-space. When triggered, one implementation causes a transient semi-transparent representation of the selected item to appear on the display of the devices in the current F-formation, i.e., this federation. To make it easy for recipients to identify who is offering an item, an animation slides in the shared item from the side of the screen where the sender is standing, as generally exemplified by the simulated movement 774 via the arrows and dashed rectangle of content becoming an actual rectangle on the user's device. Note that the recipient can ignore such an overture by leaving his tablet down, in p-space, or accept it by holding the tablet up. When the sender lets go of the transiently shared content, it disappears from the recipient's screen. However, the recipient can choose to keep a copy of the transiently shared content by touching a finger down and grabbing it while it remains visible. This precludes any need for either user to reach onto or towards the other's display, thereby avoiding persistent spatial invasion.
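Choosing the screen edge from which the shared item slides in may be sketched as follows. The sketch assumes that the recipient's tablet is held in front of the recipient and aligned with the recipient's facing vector, and that the floor coordinate frame has x to the right and y upward (an overhead image with y pointing down would mirror left and right); the function name and edge labels are hypothetical.

```python
def entry_edge(recipient_xy, recipient_facing, sender_xy):
    """Return the screen edge ('top', 'bottom', 'left' or 'right') nearest
    to the sender, as seen from the recipient's point of view."""
    fx, fy = recipient_facing
    dx = sender_xy[0] - recipient_xy[0]
    dy = sender_xy[1] - recipient_xy[1]
    forward = dx * fx + dy * fy   # component along the facing vector
    right = dx * fy - dy * fx     # component along the right-hand normal
    if abs(forward) >= abs(right):
        return "top" if forward >= 0 else "bottom"
    return "right" if right > 0 else "left"
```

For example, in a side-by-side formation the sender lies mostly along the recipient's right-hand or left-hand normal, so the item would enter from the right or left edge, whereas in a face-to-face formation it would enter from the top edge.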

The content may be transferred through the server 102 or through another server. However, an alternative is to transfer the content peer-to-peer to a peer device based upon state information that informs the sending device of the list of one or more federated devices to receive the content. Furthermore, such sharing may be deferred (e.g., with a reference to the content shared in real-time, and the actual content retrieved at a later time).

Turning to another example, referred to as “Face-to-Mirror” with respect to the full screen, a user can employ a larger tilt as a more demanding nonverbal request to interrupt the current thread of conversation and introduce something else. Users often employ large tilts to show content to their more distant partner in face-to-face formations. Face-to-Mirror is directed towards tilting to share a user's full screen view of the primary digital content displayed on the user's screen to any other tablets in the social group of an F-formation.

When a person holds their tablet vertically (at an angle larger than seventy degrees), the interactive canvas is mirrored, at full-screen scale, to the display of all other tablets of the group (FIG. 7B). Note that unlike Tilt-to-Preview, this may be a pure tilting gesture; the user does not have to touch the screen to explicitly select content. Thus, while the tilting motion is larger, the transaction cost of sharing is potentially lower because the required action is simply “show your screen to the others.” The tilting motion is large enough that incidental tilting is not an issue with this technique. As with Tilt-to-Preview, Face-to-Mirror begins as a transient sharing technique that ends when the sender moves his slate away from the vertical posture, but where recipients can retain a copy by grabbing a mirrored item.

The above two techniques share a transient representation of an item, or a permanent copy if the recipient touches down and grabs it. To explore an alternative semantic of transferring content from one device to another (that is, moving rather than copying content), a Portals technique is used. Tilting is used as a micro-mobility gesture to establish a Portal.

In one implementation, when tilting a tablet (more than ten degrees) towards the device of any other group member, a tinted edge appears along the shared screen edge of the two tablets. By dragging an item through this edge and releasing the touch, the item is (permanently) transferred to the other device. A continuous cross-display animation may be used to reinforce the metaphor of the gesture, namely the content slides off the sender's screen, and slides into the recipient's screen. The recipient can then drag, resize, and otherwise manipulate the content that was transferred to his or her tablet. As with Tilt-to-Preview, the recipient only receives items sent through a Portal if the tablet is held up in o-space (as opposed to moving it down to p-space).

Note that to an extent the gesture for Portals is basically a hybrid of Tilt-to-Preview and Face-to-Mirror, in that the user performs a fairly subtle (greater than ten degree) tilting motion (like Tilt-to-Preview) to create the portal, but does not have to touch the screen while doing so. This means that Portals may be more prone to incidental tilting; however, the feedback for Portals (a visually unobtrusive tinting along the matching edge of the devices) as well as the semantics of using the Portal (a transfer only occurs if the user explicitly passes an item through the shared edge of the Portal) means that there is very little impact if accidental activation of a Portal does occur.
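The relationship among the three tilt-based gestures may be summarized with a small decision sketch. The ten-degree and seventy-degree thresholds are taken from the description above; the function names, the accelerometer sign convention (a device lying flat reads gravity along its z axis), and the choice to gate all three gestures on the sender's device being in o-space are assumptions made for illustration.

```python
import math

def tilt_from_accel(ax, ay, az):
    """Tilt in degrees between the device's z axis and measured gravity."""
    norm = math.sqrt(ax * ax + ay * ay + az * az)
    return math.degrees(math.acos(max(-1.0, min(1.0, az / norm))))

def classify_tilt_gesture(tilt_deg, touching_content, in_o_space):
    """Map a sensed tilt to one of the tilt-based sharing gestures."""
    if not in_o_space:
        return None  # assumed: gestures are inactive outside o-space
    if tilt_deg > 70:
        return "face-to-mirror"   # near-vertical posture, no touch needed
    if touching_content and tilt_deg >= 10:
        return "tilt-to-preview"  # finger held on the selected content
    if not touching_content and tilt_deg > 10:
        return "portal"           # subtle tilt without touching
    return None
```

Because the Portal branch requires no touch, it is the branch most likely to fire on incidental tilting, which matches the observation above that its feedback is deliberately unobtrusive.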

Cross-Device Pinch-to-Zoom is another example technique, in which users can explicitly share items when the tablets are not tilted, but are held together side-by-side in o-space and at the same relatively flat angle. This technique allows viewing content across multiple tablet devices when using a pinch-to-zoom gesture. As is typical of freeform canvas interfaces, a person can use a two-finger pinch gesture to enlarge any content on the screen. However, with the knowledge of F-formations and the pose of nearby devices, when the sender enlarges the zoomed content beyond the visible area of the slate's display, the remaining content expands onto the surrounding tablets in o-space. That is, while the person zooms in, the content is displayed on the combined screen area of the tablets that form a particular F-formation (i.e., a larger cross-device virtual canvas is created).

While the above interactions illustrate examples that may apply to two-person F-formations, the techniques also apply to larger groups. As described above, for Tilt-to-Preview and Face-to-Mirror, for example, a person can share content with the entire group by tilting their tablet towards the center of the formation (i.e., towards o-space) rather than just tilting towards a single person.

Furthermore, the techniques described above may be implemented for the three types of F-formations (side-by-side, face-to-face, and corner-to-corner). Assemblage-specific gestures may be used. Note that the feedback on the screen (e.g., placement of the tinting indicating an active Portal) matches the spatial arrangement of users.

Likewise, users who are sensed as external to the F-formation cannot participate in group interactions, unless they move to stand within the group.

Turning to another aspect, a device that is not associated with a particular user may participate in a group, such as having a digital whiteboard 228 (FIG. 2) as part of an F-formation. As a result, users within a sensed F-formation can share content with the digital whiteboard in a manner analogous to sharing content to slates held by other participants. For example, consider the Hold-to-Mirror technique; a person can hold their tablet vertically towards the large display, and a temporary copy of the tablet's content appears on the large screen. Similarly, a person standing next to the whiteboard can use the Portals technique to move content to the large display by dragging content onto the edge of the slate facing towards the whiteboard 228.

Thus, one implementation considers the digital whiteboard in a manner similar to the human participants in an F-formation; that is, it has a particular location and a facing vector. When the digital whiteboard falls along the perimeter of p-space it is treated as part of the F-formation, but if the digital whiteboard falls outside the huddle, in r-space, it is not. For example, if a circle of users stands in front of the digital whiteboard, with one user's back to the display, and another user performs a Face-to-Mirror gesture, the content will be mirrored to the F-formation but not to the whiteboard. If instead the same circle expands to encompass the whiteboard, then the content is sent to the whiteboard as well.
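The treatment of a whiteboard as a participant having a location and a facing vector may be sketched, by way of example only, as follows. The sketch assumes that p-space is approximated as a ring around the o-space center and that a participant counts as facing the center when its facing direction is within a tolerance angle; the ring radii, the 60 degree default tolerance, and the names Participant, in_formation and mirror_targets are hypothetical and are not taken from the description above.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Participant:
    """A person or a situated device (e.g., a whiteboard) with a location and a facing direction."""
    name: str
    x: float
    y: float
    facing_deg: float  # degrees counterclockwise from the +x axis


def in_formation(p: Participant, center: Tuple[float, float],
                 p_inner: float, p_outer: float,
                 max_off_axis_deg: float = 60.0) -> bool:
    """True when p lies in the assumed p-space ring and faces the o-space center."""
    dx, dy = center[0] - p.x, center[1] - p.y
    dist = math.hypot(dx, dy)
    if not (p_inner <= dist <= p_outer):
        return False  # too close to the center, or out in r-space
    bearing = math.degrees(math.atan2(dy, dx))
    # Smallest absolute angle between the facing direction and the bearing to the center.
    off_axis = abs((bearing - p.facing_deg + 180.0) % 360.0 - 180.0)
    return off_axis <= max_off_axis_deg


def mirror_targets(sender: str, participants: Sequence[Participant],
                   center: Tuple[float, float],
                   p_inner: float, p_outer: float) -> List[str]:
    """Names of the formation members, other than the sender, that receive mirrored content."""
    return [p.name for p in participants
            if p.name != sender and in_formation(p, center, p_inner, p_outer)]
```

With such a test, a whiteboard standing behind the huddle is omitted from the targets of a Face-to-Mirror gesture, while the same whiteboard is included once the circle has widened so that the whiteboard lies on the ring.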

Note that the user application may be configurable to prevent inadvertently sending content to such a device, e.g., by providing an icon or the like that the user needs to explicitly select before enabling such a transfer. This may prevent content from being viewed by others outside the group, such as if the whiteboard is visible from relatively far away.

FIG. 8 is a flow diagram summarizing some example steps taken by the server, beginning at step 802 where the images are captured and combined. Step 804 processes the images to look for a formation, e.g., using the pipeline described above. Note that there is only a proper formation when more than one person (including any participating device) is present at the appropriate distances and orientations, and thus step 806 returns to step 802 when there is not a proper formation.

When there is a formation, step 808 represents determining which devices are in the formation and associating each device with a person (except for other devices such as a whiteboard). Step 810 maintains the information (e.g., the devices and associated participants) as federation data for this group.

Step 812 represents evaluating whether the data has changed, e.g., whether a new user/device has joined the federation, or another has left relative to the last state update. Note that the first time a formation is detected, this is a change to the previous state of no formation. Step 814 propagates the state change to each device in the federation.
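One pass through steps 806 to 814 may be sketched, as a non-limiting example, as follows. The sketch assumes that steps 802 to 808 (image capture, formation detection, and device association) have already produced a list of (device, person) pairs; the class FederationServer, its update method, and the notify callback are hypothetical names. Notifying devices that have just left the federation, in addition to the current members, is likewise an assumption of the sketch.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Optional, Tuple

# (device id, associated person); the person is None for a device such as a
# whiteboard that is not associated with any one user.
Member = Tuple[str, Optional[str]]


@dataclass
class FederationServer:
    """Hypothetical holder of the federation data for one group."""
    notify: Callable[[str, FrozenSet[Member]], None]  # assumed transport to a device
    state: FrozenSet[Member] = frozenset()            # empty means no formation

    def update(self, detected: List[Member]) -> bool:
        """Apply one round of detection results; return True when a change was propagated."""
        # Step 806: a proper formation needs more than one participant.
        new_state = frozenset(detected) if len(detected) > 1 else frozenset()
        # Step 812: compare against the last state update.
        if new_state == self.state:
            return False
        # Assumption: devices that just left are also told about the change.
        recipients = {device for device, _ in new_state | self.state}
        # Step 810: maintain the federation data.
        self.state = new_state
        # Step 814: propagate the state change.
        for device in sorted(recipients):
            self.notify(device, new_state)
        return True
```

In this sketch the first detection of a formation is reported as a change relative to the empty state, whereas a repeated detection of the same members, in any order, propagates nothing.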

Note that distance (and relative body orientation) is not necessarily the same between devices versus that between users. For example, two devices may be very close to one another, yet move in and out of a federation depending on who the owner is facing/talking to.

As can be seen, there is described the combination of sensing human body distance and orientation (proxemics) with sensing device presence and motions (e.g., subtle tilts) or other interactions that are given meaning and significance by a social situation. Coarse-grained relative device locations (e.g., distance, as well as angle when triangulation is possible) may be used based upon low-power radio proximity sensing integrated with the devices, which establishes device identity and lets the system maintain awareness of nearby devices that are not in the sensing range of other (e.g., camera) sensors.

The technology includes algorithms and approaches for overhead depth sensing of small group formations and classification into types (e.g., side-by-side, face-to-face, corner-to-corner) with other pairwise relationships, along with the treatment of singletons. Single touch, multi-touch and pen gestures may be used to provide natural, intuitive, and meaningful gestures for sharing. Small group sensing may be adjusted based on the presence of situated devices such as wall displays/electronic whiteboards.

EXAMPLE COMPUTING DEVICE

As mentioned, advantageously, the techniques described herein can be applied to any device. It can be understood, therefore, that handheld, portable and other computing devices and computing objects of all kinds including standalone cameras are contemplated for use in connection with the various embodiments. Accordingly, the general purpose remote computer described below in FIG. 9 is but one example of a computing device.

Embodiments can partly be implemented via an operating system, for use by a developer of services for a device or object, and/or included within application software that operates to perform one or more functional aspects of the various embodiments described herein. Software may be described in the general context of computer-executable instructions, such as program modules, being executed by one or more computers, such as client workstations, servers or other devices. Those skilled in the art will appreciate that computer systems have a variety of configurations and protocols that can be used to communicate data, and thus, no particular configuration or protocol is considered limiting.

FIG. 9 thus illustrates an example of a computing environment 900 in which one or more aspects of the embodiments described herein (such as the anti-shake correction controller) can be implemented, although as made clear herein, the computing environment 900 is only one example of a suitable computing environment and is not intended to suggest any limitation as to scope of use or functionality. In addition, the computing environment 900 is not intended to be interpreted as having any dependency relating to any one or combination of components illustrated in the example computing environment 900.

With reference to FIG. 9, an example remote device for implementing one or more embodiments includes a processing unit 920, a system memory 930, and a system bus 922 that couples various system components including the system memory to the processing unit 920.

The environment may include a variety of logic, e.g., in an integrated circuit chip and/or computer-readable media which can be any available media that can be accessed. The system memory 930 may include computer storage media in the form of volatile and/or nonvolatile memory such as read only memory (ROM) and/or random access memory (RAM). By way of example, and not limitation, system memory 930 may also include an operating system, application programs, other program modules, and program data.

A user can enter commands and information through input devices 940. A monitor or other type of display device also may be connected to the system bus 922 via an interface, such as output interface 950. In addition to a monitor, other peripheral output devices such as speakers may be connected through output interface 950.

The system may be coupled to one or more remote computers, such as remote computer 970. The remote computer 970 may be a personal computer, a server, a router, a network PC, a peer device or other common network node, or any other remote media consumption or transmission device, and may include any or all of the elements described above. The logical connections depicted in FIG. 9 include a bus such as a USB-based connection, or a wireless networking connection. Also, there are multiple ways to implement the same or similar functionality, e.g., an appropriate API, tool kit, driver code, operating system, control, standalone or downloadable software objects, etc., which enables applications and services to take advantage of the techniques provided herein. Thus, embodiments herein are contemplated from the standpoint of an API (or other software object), as well as from a software or hardware object that implements one or more embodiments as described herein. Thus, various embodiments described herein can have aspects that are wholly in hardware, partly in hardware and partly in software, as well as in software.

The word “example” is used herein to mean serving as an example, instance, or illustration. For the avoidance of doubt, the subject matter disclosed herein is not limited by such examples. In addition, any aspect or design described herein as “example” is not necessarily to be construed as preferred or advantageous over other aspects or designs, nor is it meant to preclude equivalent example structures and techniques known to those of ordinary skill in the art. Furthermore, to the extent that the terms “includes,” “has,” “contains,” and other similar words are used, for the avoidance of doubt, such terms are intended to be inclusive in a manner similar to the term “comprising” as an open transition word without precluding any additional or other elements when employed in a claim.

As mentioned, the various techniques described herein may be implemented in connection with hardware or software or, where appropriate, with a combination of both. As used herein, the terms “component,” “module,” “system” and the like are likewise intended to refer to a computer-related entity, either hardware, a combination of hardware and software, software, or software in execution. For example, a component may be, but is not limited to being, a process running on a processor, a processor, an object, an executable, a thread of execution, a program, and/or a computer. By way of illustration, both an application running on a computer and the computer can be a component. One or more components may reside within a process and/or thread of execution and a component may be localized on one computer and/or distributed between two or more computers.

The aforementioned systems have been described with respect to interaction between several components. It can be appreciated that such systems and components can include those components or specified sub-components, some of the specified components or sub-components, and/or additional components, and according to various permutations and combinations of the foregoing. Sub-components can also be implemented as components communicatively coupled to other components rather than included within parent components (hierarchical). Additionally, it can be noted that one or more components may be combined into a single component providing aggregate functionality or divided into several separate sub-components, and that any one or more middle layers, such as a management layer, may be provided to communicatively couple to such sub-components in order to provide integrated functionality. Any components described herein may also interact with one or more other components not specifically described herein but generally known by those of skill in the art.

In view of the example systems described herein, methodologies that may be implemented in accordance with the described subject matter can also be appreciated with reference to the flowcharts of the various figures. While for purposes of simplicity of explanation, the methodologies are shown and described as a series of blocks, it is to be understood and appreciated that the various embodiments are not limited by the order of the blocks, as some blocks may occur in different orders and/or concurrently with other blocks from what is depicted and described herein. Where non-sequential, or branched, flow is illustrated via flowchart, it can be appreciated that various other branches, flow paths, and orders of the blocks, may be implemented which achieve the same or a similar result. Moreover, some illustrated blocks are optional in implementing the methodologies described hereinafter.

CONCLUSION

While the invention is susceptible to various modifications and alternative constructions, certain illustrated embodiments thereof are shown in the drawings and have been described above in detail. It should be understood, however, that there is no intention to limit the invention to the specific forms disclosed, but on the contrary, the intention is to cover all modifications, alternative constructions, and equivalents falling within the spirit and scope of the invention.

In addition to the various embodiments described herein, it is to be understood that other similar embodiments can be used or modifications and additions can be made to the described embodiment(s) for performing the same or equivalent function of the corresponding embodiment(s) without deviating therefrom. Still further, multiple processing chips or multiple devices can share the performance of one or more functions described herein, and similarly, storage can be effected across a plurality of devices. Accordingly, the invention is not to be limited to any single embodiment, but rather is to be construed in breadth, spirit and scope in accordance with the appended claims.

What is claimed is:
 1. A system comprising: a sending device; and a recipient device configured to: receive content from a sending device, the sending device and the recipient device being participants in a federation established by proxemics between the sending device and the recipient device; and render the content in partial, increasing amounts over time until the content is fully rendered, the rendering in partial, increasing amounts occurring from one direction based upon a location of the sending device relative to the recipient device to have the content appear to have come from a direction of the sending device relative to the recipient device.
 2. The system of claim 1, further comprising a server configured to: maintain federation data corresponding to: which digital devices are in the formation, the digital devices comprising the sending device and the recipient device; and which users associated with the digital devices are facing a particular direction in the formation; and communicate the federation data to the digital devices in the formation to establish a federation between the digital devices in the formation, the federation between the digital devices in the formation enabling the digital devices in the formation to communicate content with one another based upon one or more micro-mobility actions detected by at least one of the digital devices in the formation, the micro-mobility actions comprising one or more of the following: device tilting, touch gesture, pen gesture, and multi-touch gesture; and wherein the server dynamically updates the federation data and the digital devices in the formation upon a change to the formation.
 3. The system of claim 2 wherein the images comprise depth data, and wherein the server includes a processing pipeline that determines users within the depth data and the orientation of the users relative to one another.
 4. The system of claim 3 wherein the orientation of the users relative to one another comprises a face to face orientation, a side-to-side orientation, or a corner-to-corner orientation.
 5. The system of claim 2 wherein the server comprises a central server coupled to one or more cameras that provide the image data.
 6. The system of claim 2 further comprising a digital device in the federation that is associated with the federation by proxemics but not associated with any one user.
 7. The system of claim 6 wherein the digital device comprises a digital whiteboard or other display mechanism configured to render content.
 8. The system of claim 2 further comprising radios having signals that are detected to associate the digital devices to users.
 9. A method comprising: receiving content at a recipient device from a sending device, the sending device and the recipient device being participants in a federation established by proxemics between the sending device and the recipient device; and rendering the content on the recipient device in partial, increasing amounts over time until the content is fully rendered, the rendering in partial, increasing amounts occurring from one direction based upon a location of the sending device relative to the recipient device to have the content appear to have come from a direction of the sending device relative to the recipient device.
 10. The method of claim 9, further comprising: determining proxemics among a plurality of devices, the plurality of digital devices comprising one or more of the following: at least two users having digital devices, at least one user and at least two digital devices, and at least two digital devices; establishing a federation between the plurality of digital devices; maintaining federation data indicative of the federation; and using the federation data to determine a set of federated devices that are able to communicate data between one another based upon the federation and one or more micro-mobility actions, the micro-mobility actions comprising one or more of the following: device tilting, touch gesture, pen gesture, and multi-touch gesture; wherein maintaining the federation data comprises updating the federation data upon a change to the federation, and wherein using the federation data to determine the set of federated devices comprises communicating the federation data to the devices.
 11. The method of claim 10 further comprising associating detected users with the devices.
 12. The method of claim 10 wherein one sending device of the federated devices shares content to at least one recipient device of the federated devices based upon the one or more of the micro-mobility actions, the sharing of content comprising moving the content from the sending device to the at least one recipient device or copying the content from the sending device to the at least one recipient device.
 13. The method of claim 12 further comprising, at a recipient device, saving a copy of the content shared thereto based upon user interaction with the recipient device.
 14. The method of claim 12 further comprising detecting a tilting of the sending device prior to sharing the content.
 15. The method of claim 12 further comprising, taking one of a plurality of sharing actions based upon an amount of tilt detected by the sending device.
 16. The method of claim 9 further comprising sharing content to one of the plurality of devices that displays the content to a plurality of users.
 17. One or more computer-readable storage media having computer-executable instructions, which when executed on at least one processor perform operations comprising: receiving content at a recipient device from a sending device, the sending device and the recipient device being participants in a federation established by proxemics between the sending device and the recipient device; and rendering the content on the recipient device in partial, increasing amounts over time until the content is fully rendered, the rendering in partial, increasing amounts occurring from one direction based upon a location of the sending device relative to the recipient device to have the content appear to have come from a direction of the sending device relative to the recipient device.
 18. The one or more computer-readable storage media of claim 17 having further computer-executable instructions comprising: detecting, at the sending device, one of a first micro-mobility action or a second micro-mobility action; and sharing a full screen of content based on the first micro-mobility action or sharing less than a full screen of content based upon the second micro-mobility action.
 19. The one or more computer-readable storage media of claim 18, wherein the first micro-mobility action and the second micro-mobility action comprises one or more of the following: device tilting, touch gesture, pen gesture, and multi-touch gesture.
 20. The one or more computer-readable storage media of claim 19, wherein a difference between the first micro-mobility action and the second micro-mobility action is an amount of tilt detected by the sending device.